A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network
نویسنده
چکیده
The quality of a text-to-speech (TTS) system heavily depends on the transcription quality of the words to be spoken. Obviously the best transcription can be found in a phonetic dictionary. But for out of vocabulary (OOV) words fall back routines have to be developed. This paper proposes a fall back routine that combines the correctness of a phonetic dictionary with the exibility of a neural network. In the rst step parts of the OOV word are looked up in the dictionary. They are then connected with the additional feature that the last phoneme of the rst part is re-estimated using a neural network and a special phonetic dictionary. In the second step the word stress is determined either from the dictionary or using a second neural network.
منابع مشابه
Solving the Phoneme Conflict in Grapheme-to-Phoneme Conversion Using a Two-Stage Neural Network-Based Approach
To achieve high quality output speech synthesis systems, data-driven grapheme-to-phoneme (G2P) conversion is usually used to generate the phonetic transcription of out-of-vocabulary (OOV) words. To improve the performance of G2P conversion, this paper deals with the problem of conflicting phonemes, where an input grapheme can, in the same context, produce many possible output phonemes at the sa...
متن کاملA Hybrid Approach to Grapheme-Phoneme Conversion
We present a simple and effective approach to the task of grapheme-tophoneme conversion based on a set of manually edited grapheme-phoneme mappings which drives not only the alignment of words and corresponding pronunciations, but also the segmentation of words during model training and application, respectively. The actual conversion is performed with the help of a conditional random field mod...
متن کاملبهبود عملکرد سیستم بازشناسی گفتار پیوسته بوسیله ویژگیهای استخراج شده از مانیفولدهای گفتاری در فضای بازسازی شده فاز
The design for new feature extraction methods out of the speech signal and combination of their obtained information is one of the most effective approaches to improve the performance of automatic speech recognition (ASR) system. Recent researches have been shown that the speech signal contains nonlinear and chaotic properties, but the effects of these properties are not used in the continuous ...
متن کاملImproving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
متن کاملHybrid approach to grapheme to phoneme conversion for Korean
In the grapheme to phoneme conversion problem for Korean, two main approaches have been discussed: knowledge-based and data-driven methods. However, both camps have limitations: the knowledge-based hand-written rules cannot handle some of the pronunciation changes due to the lack of capability of linguistic analyzers and many exceptions; data-driven methods always suffer from data sparseness. T...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2000